Abstract

A sequential multiple testing procedure recently introduced by Heinrich, Bach and
Kornmeier allows one to "zoom in" on, and thus identify, regions with highly significant
departures from null-hypotheses. The purpose of this note is to state a cognate of
this procedure in general form and to prove that it controls the familywise error. Two
possible applications are briefly indicated.



1 Introduction 

Often in statistical applications heavy multiple testing is carried out leaving two major 
questions: 

Q1: Where are significant departures from null-hypotheses?

Q2: What can be said about the overall error probability of the testing procedure?

In regard to Q2, the classical approach is to control the familywise error, i.e., to require
that the probability of any false rejection is $\le \alpha$, for some $\alpha$ fixed in advance. This may
be achieved using the Bonferroni inequality or, e.g., closed or sequential testing procedures
(Marcus et al. 1976, Holm 1979). Particularly when the number of tested hypotheses is
large, the desire to avoid any error of the first kind has to be paid for with low test power.
Therefore, as an alternative it has been suggested to control instead the false discovery
rate [FDR], i.e., to bound the expected proportion of false rejections among all rejections
(Benjamini & Hochberg 1995). While test power generally is improved with this approach,
it does not allow one to pin down those tests for which the hypothesis can be safely rejected.
Thus when using FDR control one only gets a vague answer to Q1.

There are cases, however, where some tests have very small p-values, suggesting a
massive violation of the null-hypothesis. Naturally then, one would like to be able to
reject precisely those null-hypotheses with guaranteed confidence. A sequential multiple
testing procedure designed for such cases has recently been proposed by Heinrich, Bach
& Kornmeier (2008) under the name "Conquer and Divide" [CaD]. CaD proceeds by
successively subdividing the "search space" and continues testing along each "search path"
until first acceptance of a null-hypothesis, thereby taking advantage of instances where
some of the individual tests' p-values are very small.

The purpose of this short note is to develop a general, modified version of CaD (also 
called "CaD") and to prove that it controls the familywise error. This material appears 
in Section 2. Section 3 sketches two possible applications. An elaboration of this note is 
in progress. 



2 The testing procedure 

Consider a rooted tree with vertex set $V$. For definiteness, the tree is supposed to be
"hanging downward," with the root $v_0$ on top. Each vertex $v$ splits into its (immediate)
descendants, imagined as lying one layer below $v$. Let $d(v) \subset V$ denote the set of
descendants of $v$, the number of which may differ across vertices. The splitting stops at
the $L$-th step ($L \ge 1$), such that the vertices of $V$ come in $L$ layers below the (0-th) root
layer. In particular, the tree has depth $L$ and is complete in the sense that all branches
end at the bottom layer.

With each vertex (a "location in search space") is associated a testing problem: at
every $v \in V$ a test of a certain null-hypothesis $\mathcal{H}_0(v)$ is carried out whose probability of
rejection under $\mathcal{H}_0(v)$ is $\le \alpha(v)$. Let us write $\alpha = \alpha(v_0)$ for the test level at the root
$v_0$. The test levels are assumed to satisfy the following local Bonferroni condition.

(LB) For every vertex $v \in V$ above the $L$-th layer one has $\sum_{v' \in d(v)} \alpha(v') \le \alpha(v)$.
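One simple way to satisfy (LB) is to split the level of each vertex equally among its immediate descendants. The following Python sketch illustrates this choice and checks the condition; the equal split, the dictionary-based tree encoding and the names `assign_levels` and `satisfies_lb` are assumptions made for illustration and are not prescribed by the procedure.

```python
from typing import Dict, List


def assign_levels(children: Dict[str, List[str]], root: str, alpha: float) -> Dict[str, float]:
    """Split the level of every vertex equally among its immediate descendants."""
    levels = {root: alpha}
    stack = [root]
    while stack:
        v = stack.pop()
        kids = children.get(v, [])
        for c in kids:
            levels[c] = levels[v] / len(kids)  # equal split (an illustrative choice)
            stack.append(c)
    return levels


def satisfies_lb(children: Dict[str, List[str]], levels: Dict[str, float], tol: float = 1e-12) -> bool:
    """Check (LB): the levels of the descendants of v sum to at most the level of v."""
    return all(
        sum(levels[c] for c in kids) <= levels[v] + tol
        for v, kids in children.items()
        if kids
    )


# Hypothetical complete tree of depth 2 with unequal numbers of descendants.
children = {"r": ["a", "b"], "a": ["a1", "a2"], "b": ["b1", "b2", "b3"]}
levels = assign_levels(children, "r", 0.05)
lb_holds = satisfies_lb(children, levels)

# Raising a single level breaks (LB) at vertex "a".
bad_levels = dict(levels, a1=0.03)
lb_violated = not satisfies_lb(children, bad_levels)
```

Any other allocation is admissible as long as the descendants' levels sum to at most the level of their parent; unequal weights may be preferable when some branches are of greater prior interest.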

The proposed multiple testing procedure by successive subdivision may now be described 
as follows. 

[CaD] Starting at the root $v_0$, keep testing downward each branch of the tree ("search path")
as long as the respective null-hypothesis is rejected: stop testing upon first acceptance of a
null-hypothesis, and reject all null-hypotheses that have been rejected thus far.
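The procedure can be sketched in a few lines of Python. The sketch assumes that every local test is summarized by a p-value and rejects when that p-value does not exceed the local level; the function name `cad`, the tree encoding and the numbers in the example are hypothetical.

```python
def cad(children, root, pvalues, levels):
    """Return the set of vertices whose null-hypothesis is rejected by CaD.

    Assumption: the test at vertex v rejects iff pvalues[v] <= levels[v].
    """
    rejected = set()
    stack = [root]
    while stack:
        v = stack.pop()
        if pvalues[v] <= levels[v]:
            rejected.add(v)
            stack.extend(children.get(v, []))  # keep testing below v
        # otherwise: first acceptance on this search path, so nothing below v is tested
    return rejected


# Hypothetical binary tree of depth 2 with levels halved at each split.
children = {"r": ["a", "b"], "a": ["a1", "a2"], "b": ["b1", "b2"]}
levels = {"r": 0.05, "a": 0.025, "b": 0.025,
          "a1": 0.0125, "a2": 0.0125, "b1": 0.0125, "b2": 0.0125}
pvalues = {"r": 0.001, "a": 0.004, "b": 0.2,
           "a1": 0.0001, "a2": 0.3, "b1": 0.00001, "b2": 0.5}

rejected = cad(children, "r", pvalues, levels)
```

In this example the hypothesis at `b1` is not rejected despite its tiny p-value, because the search path through `b` stops at `b`.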

We will show that the testing procedure is valid, in the sense that its familywise error
does not exceed $\alpha$. The familywise error, or probability of an error of the first kind of
the procedure CaD, equals the probability $\pi_1$ that among the hypotheses rejected by CaD
there is at least one true (hence falsely rejected) hypothesis.

Proposition 2.1 Under condition (LB) one has $\pi_1 \le \alpha$.

Proof. Let $P$ denote the probability measure underlying the observations. Given $P$, the
hypothesis $\mathcal{H}_0(v)$ (about $P$) at vertex $v$ is either true or false, independently of the
experimental outcome. Thus given $P$, we get a valued tree by assigning vertex $v$ the
truth value $t(v) = 0$ if $\mathcal{H}_0(v)$ is false, and $t(v) = 1$ otherwise. For any vertex $v$ let
$U(v)$ denote the set of all vertices $v' \in V$ that lie on the (unique) path leading from $v$
up to $v_0$, except for $v$ itself which is excluded. Let the set $F$ consist of all vertices at
which the null-hypothesis is true for the first time, 'first' in top-down direction. That is,
$F$ comprises all vertices $v \in V$ with the following two properties: (i) $t(v') = 0$ for every
$v' \in U(v)$; (ii) $t(v) = 1$. ($F = \{v_0\}$ if $t(v_0) = 1$.)

The significance of the set $F$ is the following: (*) if (the application of) CaD happens
to produce any error of the first kind (hereafter: "type I error"), then it also produces a
type I error at some vertex $v \in F$. For suppose that CaD produces a type I error at vertex
$v^* \in V$, say. If $v^* \in F$, we are done. If $v^* \notin F$, then since $t(v^*) = 1$, there exists a first
vertex $v$ on the path from $v_0$ down to $v^*$ with $t(v) = 1$, that is, there exists $v \in U(v^*) \cap F$.
Moreover, the test at $v$ rejects $\mathcal{H}_0(v)$ because otherwise the procedure would have stopped
at $v$, leaving no occasion for a type I error to occur at $v^*$. Consequently, a type I error
occurs at $v \in F$, and (*) is proven. But (*) implies

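$$\pi_1 = P[\mathcal{H}_0(v) \text{ is rejected for at least one } v \in F] \le \sum_{v \in F} P[\mathcal{H}_0(v) \text{ is rejected}], \qquad (1)$$

whence, since $P[\mathcal{H}_0(v) \text{ is rejected}] \le \alpha(v)$ for every $v \in F$, it suffices to show that

$$\sum_{v \in F} \alpha(v) \le \alpha. \qquad (2)$$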
For any complete subtree $U$ of $V$ let $\rho_U$ denote its root vertex. Then (2) is a consequence
of the following more general claim:

For every complete subtree $U$ of $V$, $S_U := \sum_{v \in F \cap U} \alpha(v) \le \alpha(\rho_U)$. (3)

We argue by induction on the depth $\ell$ of $U$ ($0 \le \ell \le L$). The case $\ell = 0$ is trivial
(since $U$ then consists of its root only), so let $1 \le \ell$ ($\le L$) and suppose that (3) holds
for every complete subtree of depth $\ell - 1$. Let $U$ be a complete subtree of depth $\ell$. If
$F \cap U$ is empty or equals $\{\rho_U\}$, there is nothing to prove. Otherwise let us decompose
$U$: each descendant $v$ of $\rho_U$ represents the root of a complete subtree $U(v)$ of $U$ of
depth $\ell - 1$. Since the vertex sets of all these subtrees are pairwise disjoint, and $\rho_U \notin F$
if $F \cap U \ne \{\rho_U\}$, the induction hypothesis and condition (LB) imply

$$S_U = \sum_{v \in d(\rho_U)} S_{U(v)} \le \sum_{v \in d(\rho_U)} \alpha(v) \le \alpha(\rho_U).$$

Thus (3) holds for any complete subtree of depth $\ell$, and the inductive proof is complete.
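Inequality (2) can be checked numerically on a small example. The Python sketch below enumerates all truth assignments on a hypothetical binary tree of depth 2, constructs the set $F$ of "first true" vertices for each, and records the largest value of $\sum_{v \in F} \alpha(v)$. The tree, the levels and the function name `first_true_vertices` are assumptions chosen for illustration; the check is a sanity test, not a substitute for the proof.

```python
from itertools import product


def first_true_vertices(children, root, truth):
    """Set F: vertices v with t(v) = 1 all of whose strict ancestors have t = 0."""
    first, stack = set(), [root]
    while stack:
        v = stack.pop()
        if truth[v] == 1:
            first.add(v)  # do not descend: every vertex below has a true ancestor
        else:
            stack.extend(children.get(v, []))
    return first


# Hypothetical binary tree of depth 2 whose levels satisfy (LB) with equality.
children = {"r": ["a", "b"], "a": ["a1", "a2"], "b": ["b1", "b2"]}
levels = {"r": 0.05, "a": 0.025, "b": 0.025,
          "a1": 0.0125, "a2": 0.0125, "b1": 0.0125, "b2": 0.0125}
vertices = list(levels)

# Largest value of the sum in (2) over all 2^7 truth assignments.
worst = 0.0
for bits in product([0, 1], repeat=len(vertices)):
    truth = dict(zip(vertices, bits))
    first = first_true_vertices(children, "r", truth)
    worst = max(worst, sum(levels[v] for v in first))
```

Here `worst` equals the root level up to rounding, which shows that the bound in (2) is attained when (LB) holds with equality.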

Remarks. The result immediately generalizes to the case where one has a collection
of rooted trees, not necessarily with identical depths, provided the levels of the tests at
the roots are controlled by Bonferroni. Note that the significance levels of the tests are
moderate initially, and become restrictive only further down the tree. This is in contrast
with other sequential procedures, e.g. Holm's (1979), where the most restrictive tests are
carried out first. Note also that no assumption is required about the joint distribution of
the test statistics. Finally, control of the familywise error implies that other common error
criteria are controlled as well. In fact, domination by the familywise error is guaranteed
for any criterion representable as the expected value of a (generally unobservable) random
variable with values in $[0, 1]$ that assumes the value 0 whenever there is no false rejection.
Examples include the false discovery rate and the per comparison error rate (Benjamini
& Hochberg, 1995, p. 291).

A further generalization of the CaD procedure deals with the case where a vertex $v$
may, itself, represent a "local" multiple testing problem along with an associated testing
procedure, $\mathcal{M}(v)$, rather than just the test of a single hypothesis, $\mathcal{H}_0(v)$. The quantity
$\alpha(v)$ then has to be interpreted as the familywise error of that testing procedure.

For example, $\mathcal{M}(v)$ may stand for the situation where $m = |d(v)|$ null-hypotheses
$\mathcal{H}_0(v')$, $v' \in d(v)$, are tested using Holm's sequential testing procedure at the level $\alpha(v)$
(familywise). At the next layer, $\mathcal{M}(v)$ splits into $m$ descendants $\mathcal{M}(v')$, $v' \in d(v)$,
where $\mathcal{M}(v')$ corresponds to a subdivision of the single hypothesis $\mathcal{H}_0(v')$ into a number
of further null-hypotheses which, again, are tested using Holm's procedure. Any multiple
testing procedure other than Holm's that controls the familywise error can be applied
as well. The CaD procedure stops at vertex $v$ if the local procedure associated with
$\mathcal{M}(v)$ accepts at least one of the single hypotheses $\mathcal{H}_0(v')$. Otherwise it continues at all
descendants $\mathcal{M}(v')$, $v' \in d(v)$. The familywise error $\pi_1$ of the extended CaD procedure is
defined as the probability that any of the local testing procedures $\mathcal{M}(v)$, $v \in V$, produces
a false rejection, which equals the probability that any of the single null-hypotheses $\mathcal{H}_0(v')$
is falsely rejected.
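The following Python sketch shows one possible reading of the extended procedure with Holm's step-down test as the local procedure. It assumes that each single hypothesis is summarized by a p-value stored under the vertex it belongs to, and that the local problem at a vertex tests the hypotheses of its immediate descendants; the function names `holm_reject` and `extended_cad` and all numbers are hypothetical.

```python
def holm_reject(pvalues, alpha):
    """Holm's step-down procedure at familywise level alpha; returns the rejected keys."""
    m = len(pvalues)
    rejected = set()
    ordered = sorted(pvalues.items(), key=lambda item: item[1])
    for i, (key, p) in enumerate(ordered):
        if p <= alpha / (m - i):
            rejected.add(key)
        else:
            break  # first acceptance ends the step-down sequence
    return rejected


def extended_cad(children, root, pvalues, levels):
    """Extended CaD: continue below v only if the local Holm test rejects everything."""
    rejected, stack = set(), [root]
    while stack:
        v = stack.pop()
        kids = children.get(v, [])
        if not kids:
            continue  # bottom layer: no local problem below v
        local = holm_reject({c: pvalues[c] for c in kids}, levels[v])
        rejected |= local
        if len(local) == len(kids):
            stack.extend(kids)  # no local acceptance: continue at all descendants
    return rejected


# Hypothetical binary tree of depth 2; levels[v] is the familywise level of the local problem at v.
children = {"r": ["a", "b"], "a": ["a1", "a2"], "b": ["b1", "b2"]}
levels = {"r": 0.05, "a": 0.025, "b": 0.025}
pvalues = {"a": 0.001, "b": 0.02, "a1": 0.001, "a2": 0.3, "b1": 0.002, "b2": 0.004}

rejected = extended_cad(children, "r", pvalues, levels)
holm_example = holm_reject({"x": 0.01, "y": 0.04, "z": 0.03}, 0.05)
```

In the example, the local problem at `a` rejects only `a1` and therefore stops, whereas the local problem at `b` rejects both of its hypotheses.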

Corollary 2.2 Under condition (LB) the extended CaD procedure described above satisfies
$\pi_1 \le \alpha$.

Proof. It suffices to assign truth values as follows: $t(v) = 1$ if any of the single hypotheses
$\mathcal{H}_0(v')$, $v' \in d(v)$, is true, and $t(v) = 0$ otherwise. The correspondingly defined set $F$
then retains its original meaning: one readily verifies that if the extended CaD procedure
produces a false rejection in the local testing problem $\mathcal{M}(v^*)$, then there is a vertex
$v \in F \cap U(v^*)$ such that the procedure produces a false rejection in the local testing
problem $\mathcal{M}(v)$. The remainder of the proof is analogous to that of the proposition.

The definition of the extended CaD procedure is chosen such that the original proof 
carries over easily. Other variants may also be of interest. 

3 Two possible applications 

Analysis of EEG data. This is the area CaD was developed for by Heinrich et al. (2008).
In electroencephalographic [EEG] studies one often wants to know where in a time series
$\{x(t),\ t \in T\}$ "something conspicuous" is happening, that is, locate one (or several) time
region(s) $C_j \subset T$ showing distinct deviations from the behaviour to be expected under
some null-hypothesis $\mathcal{H}_0$. E.g., $\mathcal{H}_0$ may mean "no systematic departure from zero",
$\mathrm{E}\,x(t) = 0$ for $t \in T$. With CaD, conspicuous regions are searched for by successively
subdividing $T$ into smaller intervals $C_j$ down to a certain level, and testing $\mathcal{H}_0$ restricted
to $C_j$ along each subdivision path until first acceptance. Simulations carried out by
Heinrich et al. (2008) suggested that CaD is conservative in the sense of Section 2, and
revealed satisfactory power properties.
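As an illustration of the interval search, the Python sketch below halves a time range repeatedly and tests the mean of each interval against zero. It assumes independent Gaussian observations with known standard deviation, a two-sided z-test, and levels halved at each split; these choices, the function name `cad_intervals` and the toy series are assumptions for illustration and do not reproduce the tests used by Heinrich et al. (2008).

```python
from math import erfc, sqrt


def cad_intervals(x, lo, hi, alpha, depth, sigma=1.0):
    """Return the rejected intervals [lo, hi) found by successive halving.

    Assumption: a two-sided z-test of "mean zero" with known sigma on each interval.
    """
    n = hi - lo
    z = sum(x[lo:hi]) / (sigma * sqrt(n))  # standardized interval mean
    p = erfc(abs(z) / sqrt(2))             # two-sided normal p-value
    if p > alpha:
        return []                          # acceptance: stop along this path
    rejected = [(lo, hi)]
    if depth > 0 and n >= 2:
        mid = (lo + hi) // 2
        # Halving the level for each half keeps condition (LB) satisfied.
        rejected += cad_intervals(x, lo, mid, alpha / 2, depth - 1, sigma)
        rejected += cad_intervals(x, mid, hi, alpha / 2, depth - 1, sigma)
    return rejected


# Toy series of length 16 with a departure from zero on the indices 8..11.
x = [0.0] * 8 + [5.0] * 4 + [0.0] * 4
found = cad_intervals(x, 0, len(x), alpha=0.05, depth=2)
```

The search zooms in from the whole range to its second half and then to the third quarter, where the departure is located.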

Thresholding of wavelet coefficients. Nonparametric curve estimation based on
thresholding of wavelet coefficients was introduced by Donoho & Johnstone (1994). As
emphasized by Abramovich & Benjamini (1995), thresholding may be regarded as a multiple
testing problem, where an estimated wavelet coefficient $w_{j,k}$ is kept or set to zero,
respectively, in accordance with the outcome of a test of the null-hypothesis that the
"true" coefficient $w_{j,k} = 0$. In this context, the above testing procedure could be applied
as follows. For $n = 2^{J+1}$ observations, the wavelet coefficients are grouped into resolution
levels $j = 1, \ldots, J$, each comprising $2^j$ coefficients $w_{j,k}$, $k = 1, \ldots, 2^j$. They can thus be
arranged as a binary tree in which coefficient $w_{j,k}$ "splits" into $w_{j+1,2k-1}$ and $w_{j+1,2k}$.
This splitting corresponds to a halving of time intervals, as is most obvious for the Haar
wavelet system. The CaD procedure applied with the tests of the hypotheses "$w_{j,k} = 0$"
may then be regarded as a method of selecting thresholds for the estimated coefficients
$w_{j,k}$. It differs from related proposals in the literature (e.g., Donoho & Johnstone (1994),
Abramovich & Benjamini (1995), or, for a different setting, Donoho & Jin (2008)) in that
the threshold is not the same for all coefficients (no matter how adaptively that global value
is chosen), but increases with the resolution level. Useful implementations may require
modifications of the tests at low resolution levels, in order to avoid stopping too early
due to a possible "averaging out" of wavelet coefficients across longer intervals. The
performance of the procedure can be studied along the lines of Abramovich & Benjamini's
(1995) article.
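To see how the threshold grows with the resolution level, the Python sketch below converts level-dependent test levels into thresholds. It assumes Gaussian noise with known standard deviation, two-sided z-tests, and the allocation $\alpha_j = \alpha / 2^j$ at level $j$, which halves the level at each binary split and splits $\alpha$ between the two coefficients at level 1; the function name `level_thresholds` and these modelling choices are assumptions for illustration.

```python
from statistics import NormalDist


def level_thresholds(J, alpha=0.05, sigma=1.0):
    """Threshold per resolution level j = 1, ..., J under the allocation alpha / 2**j.

    Assumption: a coefficient is kept iff its absolute value exceeds
    sigma times the two-sided standard normal critical value at its level.
    """
    std_normal = NormalDist()
    thresholds = {}
    for j in range(1, J + 1):
        alpha_j = alpha / 2 ** j
        thresholds[j] = sigma * std_normal.inv_cdf(1 - alpha_j / 2)
    return thresholds


thresholds = level_thresholds(J=6)
increasing = all(thresholds[j] < thresholds[j + 1] for j in range(1, 6))
```

Under these assumptions the thresholds increase monotonically in `j`, in line with the statement that the threshold increases with the resolution level.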






References 

Abramovich, F. & Benjamini, Y. (1995). Thresholding of wavelet coefficients as multiple
hypothesis testing procedure. In Wavelets and Statistics, Ed. A. Antoniadis and
G. Oppenheim, Lect. Notes Statist. Vol. 103, pp. 5-14. New York: Springer-Verlag.

Benjamini, Y. & Hochberg, Y. (1995). Controlling the false discovery rate: A practical
and powerful approach to multiple testing. J. R. Statist. Soc. B 57, 289-300.

Donoho, D. & Jin, J. (2008). Higher criticism thresholding: Optimal feature selection
when useful features are rare and weak. Proc. Natl. Acad. Sci. (USA) 105, 14790-14795.

Donoho, D. & Johnstone, I. M. (1994). Ideal spatial adaptation by wavelet shrinkage.
Biometrika 81, 425-455.

Heinrich, S. P., Bach, M. & Kornmeier, J. (2008). Conquer and Divide: A novel
approach to spatiotemporal significance testing that accounts for alpha error inflation.
Neuroimage 41 Suppl. 1, p. S159.

Holm, S. (1979). A simple sequentially rejective multiple test procedure. Scand. J. Statist.
6, 65-70.

Marcus, R., Peritz, E. & Gabriel, K. R. (1976). On closed testing procedures with
special reference to ordered analysis of variance. Biometrika 63, 655-660.



Author addresses: 

Institute for Frontier Areas of Psychology and Mental Health
Wilhelmstr. 3a, 79098 Freiburg, Germany

Department of Ophthalmology, University of Freiburg
Killianstr. 5, 79106 Freiburg, Germany

E-mail addresses:

ehm@igpp.de
kornmeier@igpp.de

sven.heinrich@uniklinik-freiburg.de






